Imagine "walking" from the tip of v to the line in an orthogonal fashion. Since the line is the span of some vector l={c⋅s∣c∈R}, we're looking for some cp so that cps is orthogonal to v−cps
To solve, notice that $v - c_p s$ is orthogonal to $s$ itself, so
$$s \cdot (v - c_p s) = 0 \implies s \cdot v - c_p (s \cdot s) = 0 \implies c_p = \frac{s \cdot v}{s \cdot s}$$
Thus, the orthogonal projection of $v$ onto $\ell = [s]$ is
$$\text{proj}_{[s]}(v) = \frac{v \cdot s}{s \cdot s} \cdot s$$
This vector $w = \text{proj}_{[s]}(v)$ is the only vector in the line $[s]$ such that $v - w$ is orthogonal to every vector in $[s]$
Example 12.1
The projection of the vector $v = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix} \in \mathbb{R}^3$ onto the line $L = \left\{ c \cdot \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \,\middle|\, c \in \mathbb{R} \right\}$
is the vector
$$\text{proj}_L(v) = \frac{\begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}}{\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}} \cdot \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = \frac{2}{6} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1/3 \\ -2/3 \\ 1/3 \end{pmatrix}$$
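As a quick numerical check of Example 12.1 (a minimal NumPy sketch; `project_onto_line` is an illustrative helper, not notation from the text):

```python
import numpy as np

def project_onto_line(v, s):
    """Orthogonal projection of v onto the line spanned by s: (v.s / s.s) s."""
    return (v @ s) / (s @ s) * s

v = np.array([3.0, 1.0, 1.0])
s = np.array([1.0, -2.0, 1.0])

w = project_onto_line(v, s)
print(w)                              # [ 0.3333... -0.6666...  0.3333...]
print(np.isclose((v - w) @ s, 0.0))   # the residual v - w is orthogonal to s: True
```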
Gram-Schmidt Orthogonalization
Notice how $v$ can be decomposed into $v = \text{proj}_{[s]}(v) + (v - \text{proj}_{[s]}(v))$
These are orthogonal, and can be seen as "non-interacting," i.e. linearly independent
Vectors $v_1, \ldots, v_k \in \mathbb{R}^n$ are mutually orthogonal if every pair of them is orthogonal,
i.e. for any $i \neq j$, $v_i$ and $v_j$ are orthogonal
For example, the standard basis vectors are mutually orthogonal
If the vectors in a set $\{v_1, \ldots, v_k\} \subset \mathbb{R}^n$ are mutually orthogonal and nonzero, then the set is linearly independent.
Proof
Consider $c_1 v_1 + \cdots + c_k v_k = 0$. For $i \in \{1, \ldots, k\}$, taking the dot product with $v_i$ on both sides gives $v_i \cdot (c_1 v_1 + \cdots + c_k v_k) = v_i \cdot 0 \implies c_i (v_i \cdot v_i) = 0$
Since $v_i \neq 0$, we have $v_i \cdot v_i \neq 0$, therefore $c_i = 0$.
Since all $c_i = 0$, the set is linearly independent.
A corollary of this is that $k$ nonzero, mutually orthogonal vectors of a $k$-dimensional vector space form a basis, because any $k$ linearly independent vectors of a $k$-dimensional space form a basis.
An orthogonal basis for a vector space is a basis of mutually orthogonal vectors
Gram-Schmidt Orthogonalization
If $\langle \beta_1, \ldots, \beta_k \rangle$ is a basis for a subspace of $\mathbb{R}^n$, then the vectors
$$\begin{aligned}
\kappa_1 &= \beta_1 \\
\kappa_2 &= \beta_2 - \text{proj}_{[\kappa_1]}(\beta_2) \\
\kappa_3 &= \beta_3 - \text{proj}_{[\kappa_1]}(\beta_3) - \text{proj}_{[\kappa_2]}(\beta_3) \\
&\;\;\vdots \\
\kappa_k &= \beta_k - \text{proj}_{[\kappa_1]}(\beta_k) - \cdots - \text{proj}_{[\kappa_{k-1}]}(\beta_k)
\end{aligned}$$
form an orthogonal basis for the same subspace. Moreover, $\text{span}(\kappa_1, \ldots, \kappa_i) = \text{span}(\beta_1, \ldots, \beta_i)$ for all $i = 1, \ldots, k$
Proof
We use induction to show that each $\kappa_i$:
a) is nonzero
b) is in the span of $\langle \beta_1, \ldots, \beta_i \rangle$
c) is orthogonal to all preceding vectors $\kappa_1, \ldots, \kappa_{i-1}$
Then by the previous corollary, $\langle \kappa_1, \ldots, \kappa_k \rangle$ is a basis
Case $i = 1$: this is trivial, since $\kappa_1 = \beta_1$
Case $i = 2$: we have
$$\kappa_2 = \beta_2 - \text{proj}_{[\kappa_1]}(\beta_2) = \beta_2 - \frac{\beta_2 \cdot \kappa_1}{\kappa_1 \cdot \kappa_1} \kappa_1 = \beta_2 - \frac{\beta_2 \cdot \kappa_1}{\kappa_1 \cdot \kappa_1} \beta_1$$
This is nonzero because the $\beta$'s are linearly independent, is clearly in the span of $\langle \beta_1, \beta_2 \rangle$, and is orthogonal to $\kappa_1$ because the projection makes the residual orthogonal
Case $i = 3$: we have
$$\begin{aligned}
\kappa_3 &= \beta_3 - \text{proj}_{[\kappa_1]}(\beta_3) - \text{proj}_{[\kappa_2]}(\beta_3) \\
&= \beta_3 - \frac{\beta_3 \cdot \kappa_1}{\kappa_1 \cdot \kappa_1} \kappa_1 - \frac{\beta_3 \cdot \kappa_2}{\kappa_2 \cdot \kappa_2} \kappa_2 \\
&= \beta_3 - \frac{\beta_3 \cdot \kappa_1}{\kappa_1 \cdot \kappa_1} \beta_1 - \frac{\beta_3 \cdot \kappa_2}{\kappa_2 \cdot \kappa_2} \left( \beta_2 - \frac{\beta_2 \cdot \kappa_1}{\kappa_1 \cdot \kappa_1} \beta_1 \right)
\end{aligned}$$
This is nonzero and in the span of $\langle \beta_1, \beta_2, \beta_3 \rangle$ because the $\beta$'s are linearly independent, and it is not hard to check that it is orthogonal to $\kappa_1$ and $\kappa_2$
Continue in this fashion to prove the claim for all $i = 1, \ldots, k$
Note that if $\langle \beta_1, \ldots, \beta_k \rangle$ is already orthogonal, the process just gives $\kappa_i = \beta_i$ for $i = 1, \ldots, k$
Example 12.2
Derive an orthogonal basis $K = \langle \kappa_1, \kappa_2 \rangle$ from the basis $B = \left\langle \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 3 \end{pmatrix} \right\rangle$
First, $\kappa_1 = \beta_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
Then,
$$\kappa_2 = \beta_2 - \text{proj}_{[\kappa_1]}(\beta_2) = \begin{pmatrix} 1 \\ 3 \end{pmatrix} - \frac{\begin{pmatrix} 1 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix}}{\begin{pmatrix} 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix}} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix} - \frac{7}{5} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -2/5 \\ 1/5 \end{pmatrix}$$
Thus, $K = \left\langle \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -2/5 \\ 1/5 \end{pmatrix} \right\rangle$
Note that $\begin{pmatrix} 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} -2/5 \\ 1/5 \end{pmatrix} = 0$, so they are indeed orthogonal
Example 12.3
Derive an orthogonal basis $K$ for $B = \left\langle \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 3 \\ -1 \end{pmatrix} \right\rangle$
$$\kappa_1 = \beta_1 = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}$$
$$\kappa_2 = \beta_2 - \text{proj}_{[\kappa_1]}(\beta_2) = \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix} - \frac{\begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}}{\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}} \cdot \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -3/2 \\ 3/2 \\ 0 \end{pmatrix}$$
$$\kappa_3 = \beta_3 - \text{proj}_{[\kappa_1]}(\beta_3) - \text{proj}_{[\kappa_2]}(\beta_3) = \begin{pmatrix} 0 \\ 3 \\ -1 \end{pmatrix} - \frac{1}{6} \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} - \frac{9/2}{9/2} \begin{pmatrix} -3/2 \\ 3/2 \\ 0 \end{pmatrix} = \begin{pmatrix} 4/3 \\ 4/3 \\ -4/3 \end{pmatrix}$$
So in summary, $K = \left\langle \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -3/2 \\ 3/2 \\ 0 \end{pmatrix}, \begin{pmatrix} 4/3 \\ 4/3 \\ -4/3 \end{pmatrix} \right\rangle$
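The whole process is mechanical, so here is a short NumPy sketch (an illustrative implementation, without normalization) that reproduces Example 12.3:

```python
import numpy as np

def gram_schmidt(basis):
    """Turn a list of linearly independent vectors into an orthogonal list."""
    orthogonal = []
    for beta in basis:
        kappa = beta.astype(float).copy()
        for k in orthogonal:
            kappa -= (beta @ k) / (k @ k) * k   # subtract proj_[k](beta)
        orthogonal.append(kappa)
    return orthogonal

B = [np.array([1.0, 1.0, 2.0]),
     np.array([-1.0, 2.0, 1.0]),
     np.array([0.0, 3.0, -1.0])]

for kappa in gram_schmidt(B):
    print(kappa)   # [1. 1. 2.], [-1.5 1.5 0.], [ 1.333...  1.333... -1.333...]
```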
Each vector of the orthogonal basis $K$ can be normalized to length 1, making it an orthonormal basis
A family of vectors in $\mathbb{R}^n$ is orthonormal if they are mutually orthogonal and all have length 1.
In other words, $\{\beta_1, \ldots, \beta_k\} \subseteq \mathbb{R}^n$ is orthonormal if $\beta_i \cdot \beta_j = 0$ for all $i, j \in \{1, \ldots, k\}$ with $i < j$, and $\beta_i \cdot \beta_i = 1$ for all $i$
If it is also a basis, then it is an orthonormal basis
Summary of Gram-Schmidt process
Any subspace of $\mathbb{R}^n$ has an orthogonal basis
If $B_M = \langle b_1, \ldots, b_k \rangle$ is an orthonormal basis for a subspace $M$ of $\mathbb{R}^n$, then for any vector $v \in M$
$$\text{Rep}_{B_M}(v) = \begin{pmatrix} v \cdot b_1 \\ \vdots \\ v \cdot b_k \end{pmatrix}$$
or equivalently $v = (v \cdot b_1) b_1 + \cdots + (v \cdot b_k) b_k$
Proof
Since $B_M$ is a basis for $M$, we can write $v = c_1 b_1 + \cdots + c_k b_k$ with $c_1, \ldots, c_k \in \mathbb{R}$. To find $c_i$, take the dot product with $b_i$:
$$\begin{aligned}
v \cdot b_i &= (c_1 b_1 + \cdots + c_i b_i + \cdots + c_k b_k) \cdot b_i \\
&= c_1 \, b_1 \cdot b_i + \cdots + c_i \, b_i \cdot b_i + \cdots + c_k \, b_k \cdot b_i \\
&= c_i
\end{aligned}$$
since $b_i \cdot b_j = 0$ for $i \neq j$ and $b_i \cdot b_i = 1$
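A brief numerical illustration of this representation, normalizing the orthogonal basis $K$ from Example 12.3 (which happens to span all of $\mathbb{R}^3$) into an orthonormal basis:

```python
import numpy as np

K = [np.array([1.0, 1.0, 2.0]),         # orthogonal basis from Example 12.3
     np.array([-1.5, 1.5, 0.0]),
     np.array([4/3, 4/3, -4/3])]
B = [k / np.linalg.norm(k) for k in K]  # normalize to an orthonormal basis

v = np.array([3.0, 1.0, 1.0])           # arbitrary vector in the span
coords = [v @ b for b in B]             # Rep_B(v) = (v.b1, ..., v.bk)
print(np.allclose(sum(c * b for c, b in zip(coords, B)), v))  # True
```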
We will say $w \in \mathbb{R}^n$ is orthogonal to a subspace $M$ of $\mathbb{R}^n$ if it is orthogonal to every vector $v \in M$, i.e. $w \cdot v = 0$ for all $v \in M$
a) The only vector $v \in M$ that is orthogonal to $M$ is $0$
b) If $w_1$ and $w_2$ are orthogonal to $M$, then any $c_1 w_1 + c_2 w_2$ with $c_1, c_2 \in \mathbb{R}$ is also orthogonal to $M$
c) If $B_M = \langle \beta_1, \ldots, \beta_k \rangle$ is a basis for $M$, then $w$ is orthogonal to $M$ iff $w \cdot \beta_i = 0$ for all $i = 1, \ldots, k$
Proofs
a) We must have $v$ orthogonal to itself, so $v \cdot v = |v|^2 = 0 \implies v = 0$
b) We have $w_1 \cdot v = 0$ and $w_2 \cdot v = 0$ for all $v \in M$, so $(c_1 w_1 + c_2 w_2) \cdot v = c_1 \, w_1 \cdot v + c_2 \, w_2 \cdot v = 0$
c) If $w \in \mathbb{R}^n$ is orthogonal to $M$, then it is orthogonal to every $\beta_i \in M$. Conversely, assume $w \in \mathbb{R}^n$ is such that $w \cdot \beta_i = 0$ for all $i = 1, \ldots, k$.
Any vector $v \in M$ can be represented as $v = c_1 \beta_1 + \cdots + c_k \beta_k$, so $w \cdot v = w \cdot (c_1 \beta_1 + \cdots + c_k \beta_k) = c_1 \, w \cdot \beta_1 + \cdots + c_k \, w \cdot \beta_k = 0$
Projection Onto a Subspace
This is a generalization of the projection onto a line.
Let $M$ be a subspace of $\mathbb{R}^n$. Then for every vector $w \in \mathbb{R}^n$, there exists a unique vector $v \in M$ such that $w - v$ is orthogonal to $M$.
We denote $v = \text{proj}_M(w)$ and call it the orthogonal projection of $w$ onto $M$.
If $B_M = \langle b_1, \ldots, b_k \rangle$ is an orthonormal basis for $M$, then $\text{proj}_M(w) = (w \cdot b_1) b_1 + \cdots + (w \cdot b_k) b_k$
Proof
We claim the vector $v = (w \cdot b_1) b_1 + \cdots + (w \cdot b_k) b_k$ is such that $w - v$ is orthogonal to $M$. Since $v \in M$ and $B_M$ is an orthonormal basis, $v = (v \cdot b_1) b_1 + \cdots + (v \cdot b_k) b_k$.
Comparing coefficients, $v \cdot b_1 = w \cdot b_1, \ldots, v \cdot b_k = w \cdot b_k$
This implies $(w - v) \cdot b_i = 0$ for all $i = 1, \ldots, k$, so by c) from before $w - v$ is orthogonal to $M$.
Now suppose $v_1, v_2 \in M$ are such that $w - v_1$ and $w - v_2$ are orthogonal to $M$. By b) from before, $(w - v_1) - (w - v_2) = v_2 - v_1$ is orthogonal to $M$; but $v_2 - v_1 \in M$, so by a) $v_2 - v_1 = 0 \implies v_2 = v_1$,
proving uniqueness
Let $M$ be a subspace of $\mathbb{R}^n$. The map $\text{proj}_M : \mathbb{R}^n \to M$, $w \mapsto \text{proj}_M(w)$, is a linear map.
Proof
We must show that for $w_1, w_2 \in \mathbb{R}^n$ and $c_1, c_2 \in \mathbb{R}$, $\text{proj}_M(c_1 w_1 + c_2 w_2) = c_1 \text{proj}_M(w_1) + c_2 \text{proj}_M(w_2)$
Both $w_1 - \text{proj}_M(w_1)$ and $w_2 - \text{proj}_M(w_2)$ are orthogonal to $M$. Therefore, the linear combination of those vectors
$$c_1 (w_1 - \text{proj}_M(w_1)) + c_2 (w_2 - \text{proj}_M(w_2)) = (c_1 w_1 + c_2 w_2) - (c_1 \text{proj}_M(w_1) + c_2 \text{proj}_M(w_2))$$
is also orthogonal to $M$
Since $c_1 \text{proj}_M(w_1) + c_2 \text{proj}_M(w_2) \in M$, the uniqueness of the projection gives $c_1 \text{proj}_M(w_1) + c_2 \text{proj}_M(w_2) = \text{proj}_M(c_1 w_1 + c_2 w_2)$
The orthogonal complement of a subspace $M$ of $\mathbb{R}^n$ is $M^\perp = \{ w \in \mathbb{R}^n \mid w \text{ is orthogonal to } M \}$
(read "M perp")
Example 12.4
Find the orthogonal complement of the plane in $\mathbb{R}^3$
$$P = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \,\middle|\, 3x + 2y - z = 0 \right\}$$
First, find a basis for $P$:
$$B = \left\langle \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} \right\rangle$$
Steps
We have $z = 3x + 2y$, so
$$P = \left\{ \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix} x + \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} y \,\middle|\, x, y \in \mathbb{R} \right\}$$
A vector $v$ that is orthogonal to every vector in $B$ is orthogonal to every vector in $\text{span}(B) = P$
So this gives two conditions:
$$\begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = 0 \qquad \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = 0$$
This gives a linear system
$$P^\perp = \left\{ \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \,\middle|\, \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \right\}$$
We therefore must find the nullspace of this matrix, which gives
$$P^\perp = \left\{ \begin{pmatrix} -3 \\ -2 \\ 1 \end{pmatrix} t \,\middle|\, t \in \mathbb{R} \right\}$$
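Numerically, the nullspace computation can be done with SciPy (a sketch; `null_space` returns an orthonormal spanning set, so we rescale for comparison):

```python
import numpy as np
from scipy.linalg import null_space

# Rows are the basis vectors of P; solutions of Ax = 0 form P-perp
A = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 2.0]])

N = null_space(A)      # a 3x1 matrix whose column spans P-perp
print(N / N[2])        # rescale so the last entry is 1: [-3, -2, 1]
```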
For a subspace $M$ and its orthogonal complement $M^\perp$:
(1) $M^\perp$ is itself a subspace
(2) $M \cap M^\perp = \{0\}$
(3) For every $w \in \mathbb{R}^n$, $w - \text{proj}_M(w) \in M^\perp$
(4) The span of $M^\perp \cup M$ is all of $\mathbb{R}^n$
(5) If $\dim(M) = k$, then $\dim(M^\perp) = n - k$
Proofs
(1) $0 \in M$ and $0 \cdot v = 0$ for all $v \in M$, so $0 \in M^\perp$ as well.
From b) from before, $M^\perp$ is closed under vector addition and scalar multiplication. Thus, $M^\perp$ is a subspace of $\mathbb{R}^n$
(2) From part a), the only vector in $M$ that is orthogonal to $M$ is $0$, so $M \cap M^\perp = \{0\}$
(3) By definition of $\text{proj}_M(w)$, $w - \text{proj}_M(w)$ is orthogonal to $M$, so $w - \text{proj}_M(w) \in M^\perp$
(4) For any $w \in \mathbb{R}^n$, we have $w = (w - \text{proj}_M(w)) + \text{proj}_M(w)$,
where $w - \text{proj}_M(w) \in M^\perp$ and $\text{proj}_M(w) \in M$, so every $w$ lies in the span of $M^\perp \cup M$
(5) First, suppose $\dim(M^\perp) = l$
Then choose orthonormal bases $B_M = \langle b_1, \ldots, b_k \rangle$ of $M$ and $B_{M^\perp} = \langle b_{k+1}, \ldots, b_{k+l} \rangle$ of $M^\perp$. Since $B_M$ spans $M$ and $B_{M^\perp}$ spans $M^\perp$, the combined family $\langle b_1, \ldots, b_k, b_{k+1}, \ldots, b_{k+l} \rangle$ spans $\mathbb{R}^n$ by (4)
We consider $b_i \cdot b_j$ for $i < j$:
If $j \leq k$, since $\langle b_1, \ldots, b_k \rangle$ is orthonormal, $b_i \cdot b_j = 0$.
If $k + 1 \leq i$, since $\langle b_{k+1}, \ldots, b_{k+l} \rangle$ is orthonormal, $b_i \cdot b_j = 0$.
If $i \leq k$ and $k + 1 \leq j$, then $b_i \in M$ and $b_j \in M^\perp$, so they are perpendicular, thus $b_i \cdot b_j = 0$.
So the family $\{b_1, \ldots, b_k, b_{k+1}, \ldots, b_{k+l}\}$ consists of nonzero, mutually orthogonal vectors, hence it is linearly independent, and it spans $\mathbb{R}^n$. Therefore $k + l = n \implies l = n - k$, finishing the proof.
If $M$ is a subspace of $\mathbb{R}^n$, then $M$ is the orthogonal complement of $M^\perp$, i.e. $(M^\perp)^\perp = M$
For every $w \in \mathbb{R}^n$, $w = \text{proj}_M(w) + \text{proj}_{M^\perp}(w)$
Proof
From the definition of $M^\perp$, if $v \in M$ then $v$ is orthogonal to every vector in $M^\perp$, so $v \in (M^\perp)^\perp$, and hence $M \subseteq (M^\perp)^\perp$.
Furthermore, we know that $\dim(M) + \dim(M^\perp) = n$ and $\dim(M^\perp) + \dim((M^\perp)^\perp) = n$, so $\dim(M) = \dim((M^\perp)^\perp)$
With those two facts, we can conclude that $M = (M^\perp)^\perp$, since a subspace contained in another subspace of the same dimension must equal it
For the second part, define $w^\perp = w - \text{proj}_M(w)$, which lies in $M^\perp$ by (3). Since $w - w^\perp = \text{proj}_M(w) \in M = (M^\perp)^\perp$, the vector $w - w^\perp$ is orthogonal to $M^\perp$, and $w^\perp \in M^\perp$, so by the uniqueness of the projection $w^\perp = \text{proj}_{M^\perp}(w)$
Finally, $w = w^\perp + \text{proj}_M(w) = \text{proj}_{M^\perp}(w) + \text{proj}_M(w)$
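A short numerical check of this decomposition, reusing the plane $P$ from Example 12.4 (the test vector $w$ here is an arbitrary illustrative choice); since $P^\perp$ is a line, projecting onto it is easy:

```python
import numpy as np

n = np.array([3.0, 2.0, -1.0])      # spans P-perp for the plane 3x + 2y - z = 0
w = np.array([1.0, 4.0, 2.0])       # arbitrary test vector

proj_perp = (w @ n) / (n @ n) * n   # proj_{P-perp}(w), a projection onto a line
proj_P = w - proj_perp              # then proj_P(w) = w - proj_{P-perp}(w)
print(np.isclose(proj_P @ n, 0.0))          # proj_P(w) lies in P: True
print(np.allclose(proj_P + proj_perp, w))   # w = proj_P(w) + proj_{P-perp}(w): True
```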
Given a subspace $M \subseteq \mathbb{R}^n$, how can we compute $\text{proj}_M(w)$ for a vector $w \in \mathbb{R}^n$?
We will suppose the basis for $M$ is $B = \langle b_1, \ldots, b_k \rangle$
If $B$ is an orthonormal basis, then we know $\text{proj}_M(w) = (w \cdot b_1) b_1 + \cdots + (w \cdot b_k) b_k = U U^T w$, where $U$ is the $n \times k$ matrix whose columns are the $b_i$'s,
or equivalently
$$\text{Rep}_{B}(\text{proj}_M(w)) = \begin{pmatrix} w \cdot b_1 \\ \vdots \\ w \cdot b_k \end{pmatrix}$$
If $B$ is an orthogonal basis, then $\left\langle \frac{b_1}{|b_1|}, \ldots, \frac{b_k}{|b_k|} \right\rangle$
is orthonormal.
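For instance (a sketch using the plane $x + z = 0$, which also appears in Example 12.5 below; its basis happens to be orthogonal already, so we only normalize):

```python
import numpy as np

b1 = np.array([0.0, 1.0, 0.0])                 # already unit length
b2 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)   # normalize to unit length
U = np.column_stack([b1, b2])                  # 3x2 matrix with orthonormal columns

w = np.array([1.0, -1.0, 1.0])
print(U @ U.T @ w)                             # proj_M(w) = [ 0. -1.  0.]
```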
If B isn't orthogonal, you could use Gram-Schmidt, but we use a more convenient formula:
Let $M \subseteq \mathbb{R}^n$ be a subspace with basis $\langle b_1, \ldots, b_k \rangle$ and let $A$ be the matrix whose columns are the $b_i$'s. Then $\text{proj}_M(v) = c_1 b_1 + \cdots + c_k b_k$
where the $c_i$'s are the entries of the vector $(A^T A)^{-1} A^T v$
or equivalently, $\text{proj}_M(v) = A (A^T A)^{-1} A^T v$
Proof
Given: $\langle b_1, \ldots, b_k \rangle$ is a basis of $M \subseteq \mathbb{R}^n$ and $A$ is the $n \times k$ matrix whose column $i$ is $b_i$. Since $\text{proj}_M(v) \in M$, we can write $\text{proj}_M(v) = c_1 b_1 + \cdots + c_k b_k = Ac$ where $c = \begin{pmatrix} c_1 \\ \vdots \\ c_k \end{pmatrix}$. The vector $v - \text{proj}_M(v)$ is orthogonal to every $b_i$, i.e. to every row of $A^T$, so
$$A^T (v - \text{proj}_M(v)) = 0 \implies A^T (v - Ac) = A^T v - A^T A c = 0 \implies c = (A^T A)^{-1} A^T v$$
(one can check that $A^T A$ is invertible because the columns of $A$ are linearly independent)
Thus, $\text{proj}_M(v) = Ac = A (A^T A)^{-1} A^T v$
Note that $(A^T A)^{-1} \neq A^{-1} (A^T)^{-1}$ here, because $A$ is not square and so is not invertible
Example 12.5
Project $v = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$ onto the plane $P = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \,\middle|\, x + z = 0 \right\}$
A basis for $P$ is $\left\langle \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \right\rangle$, so
$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad A^T = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & -1 \end{pmatrix}$$
Now, we simply compute $A (A^T A)^{-1} A^T v$:
$$A^T A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \qquad (A^T A)^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix} \qquad (A^T A)^{-1} A^T = \begin{pmatrix} 0 & 1 & 0 \\ 1/2 & 0 & -1/2 \end{pmatrix}$$
$$A (A^T A)^{-1} A^T = \begin{pmatrix} 1/2 & 0 & -1/2 \\ 0 & 1 & 0 \\ -1/2 & 0 & 1/2 \end{pmatrix}$$
Finally,
$$\text{proj}_P(v) = \begin{pmatrix} 1/2 & 0 & -1/2 \\ 0 & 1 & 0 \\ -1/2 & 0 & 1/2 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}$$
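The same computation in NumPy, following the $A (A^T A)^{-1} A^T$ formula directly (a sketch; in practice one might prefer `np.linalg.lstsq` over an explicit inverse):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.0, -1.0]])          # columns are the basis of P
v = np.array([1.0, -1.0, 1.0])

P_mat = A @ np.linalg.inv(A.T @ A) @ A.T   # the projection matrix A (A^T A)^{-1} A^T
print(P_mat @ v)                           # [ 0. -1.  0.]
```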
Given a subspace $M \subseteq \mathbb{R}^n$, the distance from $w \in \mathbb{R}^n$ to $M$ is the smallest possible distance from $w$ to a point of $M$
The distance from $w$ to $M$ is $|w - \text{proj}_M(w)|$, or equivalently, $|w - v| \geq |w - \text{proj}_M(w)|$ for all $v \in M$
Proof
We know $w = \text{proj}_M(w) + \text{proj}_{M^\perp}(w)$, so for any $v \in M$,
$$w - v = (\text{proj}_M(w) - v) + \text{proj}_{M^\perp}(w)$$
Since $v \in M$, $\text{proj}_M(w) - v \in M$, so it is orthogonal to $w - \text{proj}_M(w) = \text{proj}_{M^\perp}(w) \in M^\perp$. Therefore, by the Pythagorean theorem,
$$|w - v|^2 = |\text{proj}_M(w) - v|^2 + |w - \text{proj}_M(w)|^2$$
So $|w - v| \geq |w - \text{proj}_M(w)|$, with equality exactly when $v = \text{proj}_M(w)$
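As a final numerical sketch, the distance from the $v$ of Example 12.5 to the plane $P$:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.0, -1.0]])          # basis of the plane P: x + z = 0
w = np.array([1.0, -1.0, 1.0])

proj = A @ np.linalg.inv(A.T @ A) @ A.T @ w
print(np.linalg.norm(w - proj))      # distance |w - proj_P(w)| = sqrt(2) ≈ 1.414
```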